Credit Card Holders' Behavior Modeling: Transition Probability Prediction with Multinomial and Conditional Logistic Regression in SAS/STAT®

Authors

  • Denys Osipenko
  • Jonathan Crook
Abstract

Because of the variety of card holders' behavior patterns and income sources, each consumer account can change to states such as non-active, transactor, revolver, delinquent, and defaulted, and each account requires an individual model for generated income prediction. The estimation of the transition probability between statuses at the account level helps to avoid the lack of memory in the MDP approach. The key question is which approach gives more accurate results: multinomial logistic regression or a multistage decision tree with binary logistic regressions. This paper investigates approaches to credit card profitability estimation at the account level based on multistate conditional probability, using the SAS/STAT procedure PROC LOGISTIC. Both models show moderate, but not strong, predictive power. Prediction accuracy for the decision tree depends on the order of stages for the conditional binary logistic regressions. Current development is concentrated on discrete choice models, such as nested logit with PROC MDC.

INTRODUCTION

Credit card profitability prediction is a complex problem because of the variety of card holders' behaviour patterns and the different sources of interest and transactional income. Each consumer account can move to a number of states, such as 'transactor', 'revolver', and 'delinquent', and requires an individual model for generated income prediction. To be more reliable and accurate, credit card modelling needs to take into account the dual nature of revolving products as both a standard loan and a payment tool. Thus scoring models should be split up according to customer behaviour segment and the source of income generated for the bank. The state of a credit card depends on the type of card usage and on payment delinquency. Thus five states can be defined: inactive, transactor, revolver, delinquent, and default.
The estimation of status transition probability at the account level helps to avoid the memorylessness property of the Markov chain approach. The proposed credit card profit prediction model consists of five stages: account or consumer status prediction with conditional transition probabilities, outstanding balance and interest income estimation, non-interest income estimation, expected losses estimation, and profit estimation. The main question of this paper is which approach to predicting multistate transition probabilities gives more accurate results: multinomial logistic regression or a decision tree with conditional binary logistic regressions. The first stage of the profit estimation model is the determination of the account status via transition probabilities at the account level. Two approaches to predicting the status have been investigated: (i) conditional logistic regression as a Bayesian network, and (ii) multinomial logistic regression. This paper describes an approach to credit card profitability estimation at the account level based on a multistate conditional probabilities model. The empirical investigation presents a comparative analysis of multinomial logistic regression and the conditional probabilities model as applied to modelling credit card holders' behaviour.

GENERAL MODEL SETUP

At the high level, a credit card holder can be non-active, active, delinquent, or defaulted. Active and non-delinquent credit card holders are split into two groups: revolvers and transactors. A revolver is a user who carries a positive credit card balance and does not pay off the balance in full each month, that is, rolls the balance over.
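
The difference between the two approaches to status prediction can be illustrated with a minimal Python sketch. It is not the PROC LOGISTIC code used in the paper. The stage order of the tree (default, then delinquent, then inactive, then revolver versus transactor), the function names, and all numeric inputs are assumptions made for illustration only. In the tree, each binary model yields a probability conditional on the earlier stages not having occurred, and the state probabilities follow from the chain rule. In the multinomial model, all five state probabilities come from one set of linear predictors through a softmax with a reference state.

```python
import math

STATES = ["inactive", "transactor", "revolver", "delinquent", "default"]


def tree_state_probabilities(p_default, p_delinquent, p_inactive, p_revolver):
    """Combine conditional binary probabilities into five state probabilities.

    Each argument is assumed to be the output of one binary logistic model,
    conditional on none of the earlier stages having occurred.
    The stage order here is hypothetical, not the paper's.
    """
    probs = {}
    remaining = 1.0  # probability mass not yet assigned to a state
    probs["default"] = remaining * p_default
    remaining *= 1.0 - p_default
    probs["delinquent"] = remaining * p_delinquent
    remaining *= 1.0 - p_delinquent
    probs["inactive"] = remaining * p_inactive
    remaining *= 1.0 - p_inactive
    # The last stage splits active, non-delinquent accounts in two.
    probs["revolver"] = remaining * p_revolver
    probs["transactor"] = remaining * (1.0 - p_revolver)
    return probs


def multinomial_state_probabilities(scores):
    """Softmax over per-state linear predictors (generalized logit).

    `scores` maps each state to its linear predictor; the reference
    state is the one whose score is fixed at 0.
    """
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {state: math.exp(value - top) for state, value in scores.items()}
    total = sum(exps.values())
    return {state: e / total for state, e in exps.items()}


if __name__ == "__main__":
    # Made-up inputs for a single account, for illustration only.
    tree = tree_state_probabilities(0.1, 0.2, 0.5, 0.5)
    mnl = multinomial_state_probabilities(
        {"inactive": 0.0, "transactor": 0.4, "revolver": 0.9,
         "delinquent": -0.5, "default": -1.5}
    )
    for state in STATES:
        print(f"{state:<11} tree={tree[state]:.3f}  multinomial={mnl[state]:.3f}")
```

Both functions return probabilities that sum to one. Reordering the stages changes which binary models have to be fitted and on which subsets of accounts, which is one way to see why the accuracy of the tree can depend on stage order.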

Similar resources

The Comparative Analysis of Predictive Models for Credit Limit Utilization Rate with SAS/STAT®

Credit card usage modelling is a relatively innovative task of client predictive analytics compared to risk modelling such as credit scoring. The credit limit utilization rate is a problem with limited outcome values and highly dependent on customer behavior. Proportion prediction techniques are widely used for Loss Given Default estimation in credit risk modelling (Belotti and Crook, 2009; Ars...

Improving Credit Scoring by Generalized Additive Model

Logistic Regression has been widely used in the financial service industry for credit scoring models. Despite its advantages in easy interpretation and low computing cost, Logistic Regression is under the criticism of failure to model the nonlinear features of the predictors effect on the dependent variable and therefore might lead to unsatisfactory results. Modern statistical techniques such a...

An Empirical Analysis of Credit Card Customers’ Overdue Risks for Medium- and Small-Sized Commercial Bank in Taiwan

This paper constructs a multiple regression model to evaluate the overdue risk of credit card holders. The results can identify the factors influencing the credit card holders’ overdue risk behavior in order to provide card issuing banks a decision-making reference in the investigation of credit card holders’ related characteristics and the relationship quality between the credit card holders a...

Performing Exact Logistic Regression with the SAS System — Revised 2009

Exact logistic regression has become an important analytical technique, especially in the pharmaceutical industry, since the usual asymptotic methods for analyzing small, skewed, or sparse data sets are unreliable. Inference based on enumerating the exact distributions of sufficient statistics for parameters of interest in a logistic regression model, conditional on the remaining parameters, is...

A generic nomogram for multinomial prediction models: theory and guidance for construction

Background: The use of multinomial logistic regression models is advocated for modeling the associations of covariates with three or more mutually exclusive outcome categories. As compared to a binary logistic regression analysis, the simultaneous modeling of multiple outcome categories using a multinomial model often better resembles the clinical setting, where a physician typically must disti...

Publication date: 2015